Reducing Shading by Merging Fragments from the Adjacent Primitives

ABSTRACT

Instead of shading a triangle from the rasterizer as soon as it is known that there is a sample inside the triangle, in accordance with one embodiment, shading is delayed until the triangle beside it, called the neighboring triangle, is received. If there is a neighboring triangle facing the same way, with mutually exclusive coverage, meaning that it does not overlap the same region, then the shader shades only once for the pair of triangles. That is, two separate fragments are merged and treated as one fragment. Specifically, the fragment that is over the pixel center is the one that is used and the other fragment is replaced by merging. The merger happens only over the extent of a pixel, and no more than one primitive is shaded at a time. However, multiple merges within a 2×2 block of pixels are possible.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation application claiming priority to U.S. patent application Ser. No. 14/108,419 filed Dec. 17, 2013, hereby expressly incorporated by reference herein.

BACKGROUND

This relates generally to graphics processing.

Conventionally, tessellated or non-tessellated triangles from the rasterizer are sent to a pixel shader for shading. Whenever possible, a depth-stencil test may be performed before shading to avoid unnecessary shading. In the pixel shader, color and texture may be applied to those triangles.

Generally the triangles are sent for shading in blocks of 2×2 pixels called a shading quad. The reason for this is that there are derivatives that must be determined for mip map calculations that involve calculating finite differences in x and y directions. Information from a group of pixels is used to calculate the derivatives.

Even if a triangle touches just one pixel, the shader still ends up using at least two other pixels so that the derivatives can be determined. The shading quad is static; it always is in the same screen position regardless of how triangles land on the screen.
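By way of illustration only, the finite differences mentioned above may be sketched as follows. The pixel ordering within the quad, the names QuadDerivatives and quad_derivatives, and the choice of the top-left pixel as the reference are assumptions made for this sketch and are not taken from the described embodiments.

```cpp
#include <array>
#include <cassert>

// Hypothetical layout of one attribute (e.g. a texture coordinate) over a
// 2x2 shading quad: [0]=top-left, [1]=top-right, [2]=bottom-left, [3]=bottom-right.
struct QuadDerivatives {
    float ddx;  // finite difference in the x direction
    float ddy;  // finite difference in the y direction
};

// One difference per axis, taken from the top-left pixel. Note that the
// reference pixel needs its right and lower neighbors, i.e. two other pixels.
QuadDerivatives quad_derivatives(const std::array<float, 4>& u) {
    return QuadDerivatives{u[1] - u[0], u[2] - u[0]};
}
```

In this sketch a triangle touching only the top-left pixel still requires attribute values at the top-right and bottom-left pixels, which is one way to see why shading is dispatched per quad.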

In multi-sampled anti-aliasing (MSAA), the rasterizer typically tests, at every sample location, whether the sample location is inside the triangle being rasterized. If the sample is inside the triangle, then the entire 2×2 quad is shaded.
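One common way to perform such a per-sample inside test is with edge functions, sketched below under stated assumptions. The counter-clockwise winding, the 4-sample count, the treatment of samples lying exactly on an edge, and the names Vec2, edge_function, inside and coverage_mask are all hypothetical choices for this sketch; an actual rasterizer may differ.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct Vec2 { float x, y; };

// Positive when p lies to the left of the directed edge a->b.
float edge_function(Vec2 a, Vec2 b, Vec2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Assumes a counter-clockwise triangle. Samples exactly on an edge count as
// inside here, which ignores the tie-breaking rules a real rasterizer applies.
bool inside(Vec2 v0, Vec2 v1, Vec2 v2, Vec2 p) {
    return edge_function(v0, v1, p) >= 0.0f &&
           edge_function(v1, v2, p) >= 0.0f &&
           edge_function(v2, v0, p) >= 0.0f;
}

constexpr int kSamples = 4;  // hypothetical 4x MSAA

// Bit i of the result is set when sample i is inside the triangle.
std::uint32_t coverage_mask(Vec2 v0, Vec2 v1, Vec2 v2,
                            const std::array<Vec2, kSamples>& samples) {
    std::uint32_t mask = 0;
    for (int i = 0; i < kSamples; ++i) {
        if (inside(v0, v1, v2, samples[i])) mask |= (1u << i);
    }
    return mask;
}
```

A non-zero mask corresponds to the case in which at least one sample is inside the triangle and the quad is sent for shading.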

For example 8× multi-sampled anti-aliasing shades eight samples the same way. If all eight samples are covered by the triangle, only the pixel center is shaded and all eight samples get that color. This is true even if the triangle only hits one of the samples. Four pixels are still shaded, using that color for only the one sample covered by the triangle. Sometimes it is possible to shade at locations other than pixel centers, e.g. centroid sampling.

BRIEF DESCRIPTION OF THE DRAWINGS

Some embodiments are described with respect to the following figures:

FIG. 1 is a depiction of a primitive on a shading quad;

FIG. 2 is a depiction of a second primitive on the same shading quad;

FIG. 3 is a depiction of a third primitive on the same shading quad;

FIG. 4 is a depiction of a fourth primitive on the same shading quad;

FIG. 5 is a depiction of a fifth primitive on the same shading quad;

FIG. 6 is a depiction of a sixth primitive on the same shading quad;

FIG. 7 is a schematic depiction for one embodiment;

FIG. 8 is a flow chart for one embodiment;

FIG. 9 is a system depiction for one embodiment; and

FIG. 10 is a front elevational view for one embodiment.

DETAILED DESCRIPTION

Instead of shading a triangle from the rasterizer as soon as it is known that there is a sample inside the triangle, in accordance with one embodiment, shading is delayed until the triangle beside it, called the neighboring triangle, is received. If there is a neighboring triangle facing the same way, with mutually exclusive coverage, meaning that it is not overlapping the same region, then the shader shades only once for the pair of triangles. That is, two separate fragments are merged and treated as one fragment. Specifically, the fragment that is over the pixel center is the one that is used and the other fragment is replaced by merging. The merger only happens for one pixel.

As used herein, a triangle is a typical example of a primitive, but the concepts described herein apply to any polygon.

One benefit of merging per pixel, so that only one triangle is fired to the shader at a time, is that the calculation of the derivatives is much more accurate than with merging schemes in which more than one triangle is shaded at a time.

When more than one triangle is shaded at the same time, less accuracy may result because the derivatives must be changed for all the contributing triangles. Moreover, the shading system must interpolate using multiple triangles' edge equations.

In accordance with one embodiment, the primitive that does not cover the pixel center is not shaded and instead the samples receive the color of the neighboring fragment that does cover the pixel center. As a result, no derivatives are changed for the fragment covering the pixel center.

In some embodiments, coarse pixel shading (also called decoupled pixel shading) may be advantageous. In coarse pixel shading, the shading rate may change across the picture or frame. For example, the shading rate may be lower around the periphery of the frame and higher in the center.

In accordance with some embodiments, a merge buffer 12 is positioned between the rasterizer 10 and the pixel shader 14, as shown in FIG. 7, in a graphics pipeline. The merge buffer merges fragments from neighboring connected primitives covering the same pixel and dispatches the shading for the 2×2 block corresponding to the fragment covering the pixel center. It then uses the shading results of that fragment for the fragment that did not cover the pixel center but came from the adjacent primitive. In effect, this saves the shading of internal or non-silhouette fragments that do not cover the pixel centers. Whenever possible, depth/stencil checks are done before shading and before merge buffer 12, for “Early Depth/Stencil Test.”

Thus, some embodiments save shading only along the internal (non-silhouette) edges. As the triangle size gets smaller, the relative shading along the internal edges forms a bigger percentage of the total shading and the total bandwidth used. Thus the workload with smaller triangles may benefit more in some embodiments. The portion of the triangle that covers at least one sample in a 2×2 region along with the shading inputs at the pixels' centers for those 2×2 pixels is called a quad fragment.

The merge buffer merges the fragments within a single pixel rather than a 2×2 pixel region. Thus the merge granularity is one pixel and is decoupled from the shade granularity which is 2×2 pixels corresponding to a quad fragment.

Adjacency between two triangles may be tracked using an edge identifier in one embodiment. An edge identifier is generated using the identifiers of the vertices that form the edge. The identifier may be made generic by keeping the smaller identifier first and then the larger identifier and a bit that is set if the edge orientation is in an opposite direction. The edge identifiers for the shared edge have different orientation bits. Thus a pseudo code for the edge identifiers is as follows:

    struct Edge {
        uint smallerVertexID, largerVertexID;
        bool orientation;
    };
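As a minimal sketch of how such an identifier might be built and compared, consider the following. The helper make_edge and the body of IsShared are assumptions made for illustration; the specification names IsShared but does not define it, and the convention that the orientation bit is set when the edge runs from the larger to the smaller vertex identifier is one possible reading of the text above.

```cpp
#include <cassert>
#include <cstdint>

struct Edge {
    std::uint32_t smallerVertexID, largerVertexID;
    bool orientation;  // assumed: set when the edge runs larger -> smaller
};

// Builds the generic identifier for the directed edge from -> to.
Edge make_edge(std::uint32_t from, std::uint32_t to) {
    assert(from != to);  // a degenerate edge has no orientation
    if (from < to) return Edge{from, to, false};
    return Edge{to, from, true};
}

// Two triangles with consistent winding traverse a shared edge in opposite
// directions, so the vertex pair matches and the orientation bits differ.
bool IsShared(const Edge& a, const Edge& b) {
    return a.smallerVertexID == b.smallerVertexID &&
           a.largerVertexID == b.largerVertexID &&
           a.orientation != b.orientation;
}
```

For instance, triangle 012 contributes the directed edge 1→2 and triangle 132 contributes 2→1; under the assumed convention these compare as shared, whereas two copies of 1→2 do not.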

A fragment is a portion of a primitive covering at least one sample within a pixel. The fragment keeps track of the coverage within the pixel, the orientation of the primitive (whether it is back facing or front facing) and a pointer to the corresponding shading quad structure. The corresponding pseudo code for a fragment is as follows:

    struct Fragment {
        int x, y;                // pixel position
        bool facing;             // front facing or back facing
        BITMASK coverage;        // MULTISAMPLES number of bits
        float z[MULTISAMPLES];
        ShadingQuad* pQuad;      // corresponding shading quad
    };

A shading quad is a 2×2 block of pixels and the corresponding shading inputs. The pseudo code for the shading quad is as follows:

    struct ShadingQuad {
        BITMASK shade_coverage;            // 4 bits
        SHADER_INPUT shadeInputData[4];
    };

A merge buffer stores fragments coming from partially covered quad fragments along with the outer edges of the contiguous primitives contributing to the fragment either before or after the merge. When n contiguous triangles are merged, one can have at most (n+2) outer edges, because each merge removes at least one shared edge and adds at most two more outer edges. Thus the buffer entry is as follows:

    struct BufferEntry {
        Fragment frag;
        Edge outerEdges[N+2];    // N is configurable
    };

The criteria for merger are whether two primitives share a common edge, face the same way, have mutually exclusive pixel coverage and involve one and only one pixel. In the following pseudo code, the function IsAdjacent( ) checks to see if an incoming fragment shares an edge with an existing fragment:

    // e1 is the existing buffer entry and e2 is the incoming buffer entry.
    bool can_merge (BufferEntry e1, BufferEntry e2) {
        return e1.frag.x == e2.frag.x &&
               e1.frag.y == e2.frag.y &&
               e1.frag.facing == e2.frag.facing &&
               (e1.frag.coverage & e2.frag.coverage) == 0 &&
               IsAdjacent (e1, e2);
    }

    // The outer edges are stored in the buffer entry, not in the fragment.
    bool IsAdjacent (BufferEntry e1, BufferEntry e2) {
        foreach edge (e1.outerEdges[ ]) {
            foreach triEdge (e2.outerEdges[ ]) {
                if (IsShared (edge, triEdge))
                    return true;
            }
        }
        return false;
    }

If the fragments can be merged, the two fragments are merged according to the following pseudo code:

    // merge quad-fragment in entry e2 into e1
    void merge (e1, e2) {
        select_shading_inputs (e1.frag, e2.frag);
        copy_z (e1.frag, e2.frag);
        e1.frag.coverage |= e2.frag.coverage;
        UpdateOuterEdges (e1);
    }

The function UpdateOuterEdges (Fragment f) updates the outer edges of the fragment f after deleting the common shared edge(s).
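One possible way to realize the outer-edge update is sketched below; it is an illustration under assumptions rather than the described implementation. The functions make_edge, is_shared, triangle_edges, has_twin and merged_outer_edges are hypothetical helpers, a std::vector stands in for the fixed-size outerEdges array, and the orientation convention is assumed as in the edge identifier description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Edge {
    std::uint32_t smallerVertexID, largerVertexID;
    bool orientation;  // assumed: set when the edge runs larger -> smaller
};

Edge make_edge(std::uint32_t from, std::uint32_t to) {
    assert(from != to);
    return from < to ? Edge{from, to, false} : Edge{to, from, true};
}

bool is_shared(const Edge& a, const Edge& b) {
    return a.smallerVertexID == b.smallerVertexID &&
           a.largerVertexID == b.largerVertexID &&
           a.orientation != b.orientation;
}

// Directed edges of a triangle whose vertices are given in submission order.
std::vector<Edge> triangle_edges(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    return {make_edge(a, b), make_edge(b, c), make_edge(c, a)};
}

bool has_twin(const Edge& e, const std::vector<Edge>& others) {
    return std::any_of(others.begin(), others.end(),
                       [&](const Edge& o) { return is_shared(e, o); });
}

// Outer boundary after a merge: every edge of either input that has no
// oppositely oriented twin in the other input.
std::vector<Edge> merged_outer_edges(const std::vector<Edge>& existing,
                                     const std::vector<Edge>& incoming) {
    std::vector<Edge> outer;
    for (const Edge& e : existing)
        if (!has_twin(e, incoming)) outer.push_back(e);
    for (const Edge& e : incoming)
        if (!has_twin(e, existing)) outer.push_back(e);
    return outer;
}
```

Merging triangles 012 and 132 in this sketch leaves four outer edges, and merging 143 into the result leaves five, consistent with the (n+2) bound for n = 2 and n = 3.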

Consider an example of a 2×2 region of a render target where individual pixels are marked as a, b, c and d in FIGS. 1-6. The illustrative triangles are indicated by the identifiers of their vertices and submitted in the order 012, 132, 143, 453, 352, and 562. Sample locations are shown in black circles in FIGS. 1-6 and pixel centers are shown as open circles. The merge buffer and a buffer that stores the shading quads, titled “shading quads,” are also shown in FIGS. 1-6. Fragments are named corresponding to the pixel they cover and the suffix indicates the outer boundary of the polygon. The outer boundary of the polygon is maintained as a set of oriented edges with edge identifiers. When a new primitive covers a pixel, each of its edges is compared with the edges of the polygon's boundary to see if the polygon can be merged.

FIG. 1 shows the primitive 012 covering a pixel c, generating a fragment c₀₁₂ and the shading quad SQ₀₁₂. The shading quad SQ₀₁₂ is generated by extrapolating the attributes using the plane equation of triangle 012.

When the next primitive 132 gets rasterized, two fragments a₁₃₂ and c₁₃₂, both pointing to the same shading quad SQ₁₃₂, are generated. However, because c₁₃₂ and c₀₁₂ share a common edge (1,2), are facing the same way and their coverage over pixel c is mutually exclusive, they are merged into a single fragment c₀₁₃₂ as shown in FIG. 2. This single merged fragment points to the shading quad SQ₀₁₂ because the fragment c₀₁₂ (one of the merge candidates) covered the pixel center and pointed to SQ₀₁₂.

FIG. 3 illustrates the events that happen when primitive 143 gets rasterized and merged. Because the merged fragment a₁₄₃₂ starts pointing to a different shading quad (SQ₁₄₃) and there is no fragment that points to SQ₁₃₂, a reference counting mechanism deletes SQ₁₃₂ and marks that entry in the shading quad buffer as available.
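The reference counting described above might, purely as an illustrative sketch, look like the following. The class ShadingQuadBuffer, the struct QuadSlot and their member names are hypothetical, and only the bookkeeping is modeled; the shading inputs themselves are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shading quad buffer slot; shading inputs are omitted.
struct QuadSlot {
    int refCount = 0;    // number of fragments pointing at this shading quad
    bool inUse = false;
};

class ShadingQuadBuffer {
public:
    explicit ShadingQuadBuffer(std::size_t capacity) : slots_(capacity) {}

    // Claims a free slot with one reference; returns -1 when the buffer is full.
    int allocate() {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (!slots_[i].inUse) {
                slots_[i].inUse = true;
                slots_[i].refCount = 1;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Called when another fragment starts pointing at the quad.
    void add_ref(int quad) {
        assert(is_live(quad));
        ++slots_[static_cast<std::size_t>(quad)].refCount;
    }

    // Called when a fragment stops pointing at the quad; the slot is marked
    // available once no fragment points at it.
    void release(int quad) {
        assert(is_live(quad));
        QuadSlot& slot = slots_[static_cast<std::size_t>(quad)];
        if (--slot.refCount == 0) slot.inUse = false;
    }

    bool is_live(int quad) const {
        return quad >= 0 && static_cast<std::size_t>(quad) < slots_.size() &&
               slots_[static_cast<std::size_t>(quad)].inUse;
    }

private:
    std::vector<QuadSlot> slots_;
};
```

In terms of FIGS. 2 and 3, SQ₁₃₂ would start with two references (a₁₃₂ and c₁₃₂), lose one when c₁₃₂ is merged, and be freed when the merged fragment a₁₄₃₂ begins pointing to SQ₁₄₃.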

FIGS. 4-6 show the merge process after successive primitives 453, 352, and 562 are rasterized and sent to the merge buffer.

As a result of the merges, the number of shading requests has been reduced from 6 to 4. Shading quads are dispatched for shading when the fragments pointing to them have full coverage, when the shading quad buffer is full, or when an overlapping fragment arrives.
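The three dispatch conditions may be sketched as a single predicate, as shown below. This is a simplification under stated assumptions: it considers one stored fragment and one incoming fragment in the same pixel, uses a hypothetical 4-sample coverage mask, and the name should_dispatch is not taken from the specification.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kSamples = 4;  // hypothetical sample count per pixel
using Coverage = std::bitset<kSamples>;

// Returns true when the shading quad behind the stored fragment should be
// sent to the shader instead of waiting for further merges.
bool should_dispatch(const Coverage& stored, const Coverage& incoming,
                     bool quadBufferFull) {
    const bool fullCoverage = stored.all();           // nothing left to merge
    const bool overlaps = (stored & incoming).any();  // coverage not exclusive
    return fullCoverage || quadBufferFull || overlaps;
}
```

A design consequence of the third condition is that an overlapping fragment, which cannot be merged, forces the earlier work out of the buffer rather than stalling behind it.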

Referring to FIG. 8, a merge buffer sequence 12 may be implemented in software, firmware and/or hardware. In software and firmware embodiments, it may be implemented by computer executed instructions stored in one or more non-transitory computer readable media, such as magnetic, optical, or semiconductor storages. For example, the sequence may be implemented by the merge buffer 12 of FIG. 7.

The merge buffer sequence 12 begins by receiving two fragments from neighboring primitives, as indicated in block 16. A check at diamond 18 determines whether the fragments share a common edge, face the same way, have mutually exclusive pixel coverage, and relate to the same pixel.

If so, the fragments are merged and the merged fragment points to the shading quad of the fragment that covers the pixel's center, as indicated in block 20. Then, a check at diamond 22 determines whether the merged fragment points to a new shading quad and no fragment points to the old shading quad. If so, the old shading quad is deleted, as indicated in block 24, and, in either case, the flow ends.

Coarse pixel shading may reduce power consumption by reducing the shading rate to less than once per pixel, while keeping the visibility the same. The techniques described here are readily applicable to coarse pixel shading. This can be achieved by adding one additional bit per entry in the merge buffer and the shading quad buffer that tracks the shading rate. No merge may happen across the shading rates in one embodiment. Coarse pixel shading has the effect of making triangles smaller in size. Scenes with smaller triangles are more likely to benefit from merging than ones with larger triangles.
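As a hedged sketch of the additional shading-rate bit, the merge test might be extended as follows. The struct RatedFragment, the field name coarseRate and the function can_merge_same_rate are hypothetical, and the adjacency test on outer edges is deliberately left out to keep the example short.

```cpp
#include <cassert>
#include <cstdint>

struct RatedFragment {
    int x, y;                // pixel position
    bool facing;             // front facing or back facing
    std::uint32_t coverage;  // one bit per sample
    bool coarseRate;         // hypothetical name for the one additional bit
};

// Same-pixel, same-facing, mutually exclusive coverage, and additionally the
// same shading rate. The shared-edge check is omitted from this sketch.
bool can_merge_same_rate(const RatedFragment& a, const RatedFragment& b) {
    return a.coarseRate == b.coarseRate &&
           a.x == b.x && a.y == b.y &&
           a.facing == b.facing &&
           (a.coverage & b.coverage) == 0;
}
```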

In accordance with some embodiments, register pressure may be reduced compared to shading up to four different primitives in one batch since the shading system does not have to handle multiple primitives in one single batch. In addition, derivatives are not changed for all contributing fragments; only the shaded colors of neighboring samples (within that pixel) are used. In schemes that shade fragments for more than one primitive, the system may use incorrect derivatives for all the contributing fragments.

Even though some embodiments are described in the context of untessellated triangle meshes, a similar scheme may be employed for tessellated meshes. One can use the patch barycentric coordinates ((u,v)-s or (u,v,w)-s) at the tessellated vertices to track the edge identifiers for the edges that are internal to the patch. For edges along the patch edges, one can use the combination of parameter values and corner identifiers to establish the adjacency during the merge.

Another possible implementation may choose the triangle that has the largest coverage within a pixel, instead of choosing the one that covers the pixel center.
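The largest-coverage alternative might be sketched by counting the set bits of each coverage mask, as below. The 8-sample mask, the name pick_by_largest_coverage and the rule that ties go to the existing fragment are assumptions made for this illustration only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kSamples = 8;  // hypothetical 8x MSAA
using Coverage = std::bitset<kSamples>;

// Returns 0 when the existing fragment should be shaded and 1 when the
// incoming fragment should be shaded. Ties go to the existing fragment,
// which is an arbitrary choice for this sketch.
int pick_by_largest_coverage(const Coverage& existing, const Coverage& incoming) {
    assert((existing & incoming).none());  // merge requires exclusive coverage
    return incoming.count() > existing.count() ? 1 : 0;
}
```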

The graphics processing techniques described herein may be implemented in various hardware architectures. For example, graphics functionality may be integrated within a chipset. Alternatively, a discrete graphics processor may be used. As still another embodiment, the graphics functions may be implemented by a general purpose processor, including a multicore processor.

FIG. 9 illustrates an embodiment of a system 700. In embodiments, system 700 may be a media system although system 700 is not limited to this context. For example, system 700 may be incorporated into a personal computer (PC), laptop computer, ultra-laptop computer, tablet, touch pad, portable computer, handheld computer, palmtop computer, personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, television, smart device (e.g., smart phone, smart tablet or smart television), mobile internet device (MID), messaging device, data communication device, and so forth.

In embodiments, system 700 comprises a platform 702 coupled to a display 720. Platform 702 may receive content from a content device such as content services device(s) 730 or content delivery device(s) 740 or other similar content sources. A navigation controller 750 comprising one or more navigation features may be used to interact with, for example, platform 702 and/or display 720. Each of these components is described in more detail below.

In embodiments, platform 702 may comprise any combination of a chipset 705, processor 710, memory 712, storage 714, graphics subsystem 715, applications 716 and/or radio 718. Chipset 705 may provide intercommunication among processor 710, memory 712, storage 714, graphics subsystem 715, applications 716 and/or radio 718. For example, chipset 705 may include a storage adapter (not depicted) capable of providing intercommunication with storage 714.

Processor 710 may be implemented as Complex Instruction Set Computer (CISC) or Reduced Instruction Set Computer (RISC) processors, x86 instruction set compatible processors, multi-core, or any other microprocessor or central processing unit (CPU). In embodiments, processor 710 may comprise dual-core processor(s), dual-core mobile processor(s), and so forth. The processor may implement the sequence of FIG. 8, together with memory 712.

Memory 712 may be implemented as a volatile memory device such as, but not limited to, a Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), or Static RAM (SRAM).

Storage 714 may be implemented as a non-volatile storage device such as, but not limited to, a magnetic disk drive, optical disk drive, tape drive, an internal storage device, an attached storage device, flash memory, battery backed-up SDRAM (synchronous DRAM), and/or a network accessible storage device. In embodiments, storage 714 may comprise technology to increase the storage performance enhanced protection for valuable digital media when multiple hard drives are included, for example.

Graphics subsystem 715 may perform processing of images such as still or video for display. Graphics subsystem 715 may be a graphics processing unit (GPU) or a visual processing unit (VPU), for example. An analog or digital interface may be used to communicatively couple graphics subsystem 715 and display 720. For example, the interface may be any of a High-Definition Multimedia Interface, DisplayPort, wireless HDMI, and/or wireless HD compliant techniques. Graphics subsystem 715 could be integrated into processor 710 or chipset 705. Graphics subsystem 715 could be a stand-alone card communicatively coupled to chipset 705.

The graphics and/or video processing techniques described herein may be implemented in various hardware architectures. For example, graphics and/or video functionality may be integrated within a chipset. Alternatively, a discrete graphics and/or video processor may be used. As still another embodiment, the graphics and/or video functions may be implemented by a general purpose processor, including a multi-core processor. In a further embodiment, the functions may be implemented in a consumer electronics device.

Radio 718 may include one or more radios capable of transmitting and receiving signals using various suitable wireless communications techniques. Such techniques may involve communications across one or more wireless networks. Exemplary wireless networks include (but are not limited to) wireless local area networks (WLANs), wireless personal area networks (WPANs), wireless metropolitan area network (WMANs), cellular networks, and satellite networks. In communicating across such networks, radio 718 may operate in accordance with one or more applicable standards in any version.

In embodiments, display 720 may comprise any television type monitor or display. Display 720 may comprise, for example, a computer display screen, touch screen display, video monitor, television-like device, and/or a television. Display 720 may be digital and/or analog. In embodiments, display 720 may be a holographic display. Also, display 720 may be a transparent surface that may receive a visual projection. Such projections may convey various forms of information, images, and/or objects. For example, such projections may be a visual overlay for a mobile augmented reality (MAR) application. Under the control of one or more software applications 716, platform 702 may display user interface 722 on display 720.

In embodiments, content services device(s) 730 may be hosted by any national, international and/or independent service and thus accessible to platform 702 via the Internet, for example. Content services device(s) 730 may be coupled to platform 702 and/or to display 720. Platform 702 and/or content services device(s) 730 may be coupled to a network 760 to communicate (e.g., send and/or receive) media information to and from network 760. Content delivery device(s) 740 also may be coupled to platform 702 and/or to display 720.

In embodiments, content services device(s) 730 may comprise a cable television box, personal computer, network, telephone, Internet enabled devices or appliance capable of delivering digital information and/or content, and any other similar device capable of unidirectionally or bidirectionally communicating content between content providers and platform 702 and/or display 720, via network 760 or directly. It will be appreciated that the content may be communicated unidirectionally and/or bidirectionally to and from any one of the components in system 700 and a content provider via network 760. Examples of content may include any media information including, for example, video, music, medical and gaming information, and so forth.

Content services device(s) 730 receives content such as cable television programming including media information, digital information, and/or other content. Examples of content providers may include any cable or satellite television or radio or Internet content providers. The provided examples are not meant to limit embodiments of the disclosure.

In embodiments, platform 702 may receive control signals from navigation controller 750 having one or more navigation features. The navigation features of controller 750 may be used to interact with user interface 722, for example. In embodiments, navigation controller 750 may be a pointing device that may be a computer hardware component (specifically human interface device) that allows a user to input spatial (e.g., continuous and multi-dimensional) data into a computer. Many systems such as graphical user interfaces (GUI), and televisions and monitors allow the user to control and provide data to the computer or television using physical gestures, facial expressions or sounds.

Movements of the navigation features of controller 750 may be echoed on a display (e.g., display 720) by movements of a pointer, cursor, focus ring, or other visual indicators displayed on the display. For example, under the control of software applications 716, the navigation features located on navigation controller 750 may be mapped to virtual navigation features displayed on user interface 722. In embodiments, controller 750 may not be a separate component but integrated into platform 702 and/or display 720. Embodiments, however, are not limited to the elements or in the context shown or described herein.

In embodiments, drivers (not shown) may comprise technology to enable users to instantly turn on and off platform 702 like a television with the touch of a button after initial boot-up, when enabled, for example. Program logic may allow platform 702 to stream content to media adaptors or other content services device(s) 730 or content delivery device(s) 740 when the platform is turned “off.” In addition, chip set 705 may comprise hardware and/or software support for 5.1 surround sound audio and/or high definition 7.1 surround sound audio, for example. Drivers may include a graphics driver for integrated graphics platforms. In embodiments, the graphics driver may comprise a peripheral component interconnect (PCI) Express graphics card.

In various embodiments, any one or more of the components shown in system 700 may be integrated. For example, platform 702 and content services device(s) 730 may be integrated, or platform 702 and content delivery device(s) 740 may be integrated, or platform 702, content services device(s) 730, and content delivery device(s) 740 may be integrated, for example. In various embodiments, platform 702 and display 720 may be an integrated unit. Display 720 and content service device(s) 730 may be integrated, or display 720 and content delivery device(s) 740 may be integrated, for example. These examples are not meant to limit the disclosure.

In various embodiments, system 700 may be implemented as a wireless system, a wired system, or a combination of both. When implemented as a wireless system, system 700 may include components and interfaces suitable for communicating over a wireless shared media, such as one or more antennas, transmitters, receivers, transceivers, amplifiers, filters, control logic, and so forth. An example of wireless shared media may include portions of a wireless spectrum, such as the RF spectrum and so forth. When implemented as a wired system, system 700 may include components and interfaces suitable for communicating over wired communications media, such as input/output (I/O) adapters, physical connectors to connect the I/O adapter with a corresponding wired communications medium, a network interface card (NIC), disc controller, video controller, audio controller, and so forth. Examples of wired communications media may include a wire, cable, metal leads, printed circuit board (PCB), backplane, switch fabric, semiconductor material, twisted-pair wire, co-axial cable, fiber optics, and so forth.

Platform 702 may establish one or more logical or physical channels to communicate information. The information may include media information and control information. Media information may refer to any data representing content meant for a user. Examples of content may include, for example, data from a voice conversation, videoconference, streaming video, electronic mail (“email”) message, voice mail message, alphanumeric symbols, graphics, image, video, text and so forth. Data from a voice conversation may be, for example, speech information, silence periods, background noise, comfort noise, tones and so forth. Control information may refer to any data representing commands, instructions or control words meant for an automated system. For example, control information may be used to route media information through a system, or instruct a node to process the media information in a predetermined manner. The embodiments, however, are not limited to the elements or in the context shown or described in FIG. 9.

As described above, system 700 may be embodied in varying physical styles or form factors. FIG. 10 illustrates embodiments of a small form factor device 800 in which system 700 may be embodied. In embodiments, for example, device 800 may be implemented as a mobile computing device having wireless capabilities. A mobile computing device may refer to any device having a processing system and a mobile power source or supply, such as one or more batteries, for example.

As described above, examples of a mobile computing device may include a personal computer (PC), laptop computer, ultra-laptop computer, tablet, touch pad, portable computer, handheld computer, palmtop computer, personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, television, smart device (e.g., smart phone, smart tablet or smart television), mobile internet device (MID), messaging device, data communication device, and so forth.

Examples of a mobile computing device also may include computers that are arranged to be worn by a person, such as a wrist computer, finger computer, ring computer, eyeglass computer, belt-clip computer, arm-band computer, shoe computers, clothing computers, and other wearable computers. In embodiments, for example, a mobile computing device may be implemented as a smart phone capable of executing computer applications, as well as voice communications and/or data communications. Although some embodiments may be described with a mobile computing device implemented as a smart phone by way of example, it may be appreciated that other embodiments may be implemented using other wireless mobile computing devices as well. The embodiments are not limited in this context.

The processor 710 may communicate with a camera 722 and a global positioning system sensor 720, in some embodiments. A memory 712, coupled to the processor 710, may store computer readable instructions for implementing the sequence shown in FIG. 8 in software and/or firmware embodiments.

As shown in FIG. 10, device 800 may comprise a housing 802, a display 804, an input/output (I/O) device 806, and an antenna 808. Device 800 also may comprise navigation features 812. Display 804 may comprise any suitable display unit for displaying information appropriate for a mobile computing device. I/O device 806 may comprise any suitable I/O device for entering information into a mobile computing device. Examples for I/O device 806 may include an alphanumeric keyboard, a numeric keypad, a touch pad, input keys, buttons, switches, rocker switches, microphones, speakers, voice recognition device and software, and so forth. Information also may be entered into device 800 by way of microphone. Such information may be digitized by a voice recognition device. The embodiments are not limited in this context.

Various embodiments may be implemented using hardware elements, software elements, or a combination of both. Examples of hardware elements may include processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an embodiment is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints.

The following clauses and/or examples pertain to further embodiments:

One example embodiment may be a method comprising merging two fragments from neighboring primitives that cover the same pixel, and shading only one primitive at a time. The method may include delaying shading until information about a neighboring primitive is available. The method may include merging untessellated primitives. The method may include merging in tessellated meshes. The method may include merging if the two fragments face the same way and have mutually exclusive coverage. The method may include only shading the fragment that covers the pixel's center. The method may include only shading the fragment with largest coverage. The method may include using multi-sampled antialiasing. The method may include determining whether a merged fragment points to one shading quad and no fragment points to another shading quad. The method may include, if so, deleting the other shading quad.

Another example embodiment may include one or more non-transitory computer readable media storing instructions executed by a processor to perform a method comprising merging two fragments from neighboring primitives that cover the same pixel, and shading only one primitive at a time. The media may further store said method including delaying shading until information about a neighboring primitive is available. The media may further store said method including merging untessellated primitives. The media may further store said method including merging in tessellated meshes. The media may further store said method including merging if the two fragments face the same way and have mutually exclusive coverage. The media may further store said method including only shading the fragment that covers the pixel's center. The media may further store said method including only shading the fragment with largest coverage. The media may further store said method including using multi-sampled antialiasing. The media may further store said method including determining whether a merged fragment points to one shading quad and no fragment points to another shading quad. The media may further store said method including, if so, deleting the other shading quad.

Another example embodiment may be an apparatus comprising a processor to merge two fragments from neighboring primitives that cover the same pixel and to shade only one primitive at a time, and a storage coupled to said processor. The apparatus may include said processor to delay shading until information about a neighboring primitive is available. The apparatus may include said processor to merge untessellated primitives. The apparatus may include said processor to merge in tessellated meshes. The apparatus may include said processor to merge if the two fragments face the same way and have mutually exclusive coverage. The apparatus may include said processor to only shade the fragment that covers the pixel's center. The apparatus may include said processor to only shade the fragment with largest coverage. The apparatus may include an operating system, a battery and firmware and a module to update said firmware.

References throughout this specification to “one embodiment” or “an embodiment” mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation encompassed within the present disclosure. Thus, appearances of the phrase “one embodiment” or “in an embodiment” are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be instituted in other suitable forms other than the particular embodiment illustrated and all such forms may be encompassed within the claims of the present application.

While a limited number of embodiments have been described, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this disclosure.

1-30. (canceled)
31. One or more non-transitory computer readable media storing instructions executed by a processor to perform a sequence comprising: detecting that a non-silhouette primitive shares an edge with an adjacent second primitive; testing whether the first and second primitives are overlapping; and merging the first and second primitives that share an edge.
32. The media of claim 31 further storing instructions to perform a sequence including merging the first and second primitives if they are not overlapping.
33. The media of claim 32 further storing instructions to perform a sequence including using vertex array data to determine if the first and second primitives have a shared edge.
34. An apparatus comprising: a processor to detect that a non-silhouette primitive shares an edge with an adjacent second primitive, test whether the first and second primitives are overlapping, and merge the first and second primitives that share an edge; and a memory coupled to said processor.
35. The apparatus of claim 34, said processor to merge the first and second primitives if they are not overlapping.
36. The apparatus of claim 35, said processor to use vertex array data to determine if the first and second primitives have a shared edge.
37. A system comprising: a processor to detect that a non-silhouette primitive shares an edge with an adjacent second primitive, test whether the first and second primitives are overlapping, and merge the first and second primitives that share an edge; and a display coupled to said processor.
38. The system of claim 37, said processor to merge the first and second primitives if they are not overlapping.
39. The system of claim 38, said processor to use vertex array data to determine if the first and second primitives have a shared edge.